Welcome to class!

About Us

Carrie Wright

Assistant Scientist, Department of Biostatistics, JHSPH

PhD in Biomedical Sciences

Email: cwrigh60@jhu.edu

Website: carriewright11.github.io

Carrie's picture

About Us

Ava Hoffman

Research Associate, Department of Biostatistics, JHSPH

PhD in Ecology

Email: ava.hoffman@jhu.edu

Website: avahoffman.com

Ava's picture

About Us - TAs

Grant Schumock

PhD Candidate, Department of Biostatistics, JHSPH

BS in Nuclear Engineering

Email: gschumo1@jhmi.edu

Grant's picture

About Us - TAs

Qier Meng

ScM Student, Department of Biostatistics, JHSPH

Bachelor’s Degree in Mathematics

Bachelor’s Degree in Neuroscience

Email: qmeng11@jhmi.edu

Qier's picture

What is R?

Why R?

  • High-level language designed for statistical computing

  • Powerful and flexible - especially for data wrangling and visualization

  • Free (open source)

  • Extensive add-on software (packages)

  • Strong community

R-Ladies - a non-profit civil society community (source: https://rladies-baltimore.github.io/)

Why not R?

  • Fairly steep learning curve

    • “Programming” oriented

    • Minimal interface

  • Little centralized support; relies on the online community and package developers

  • Annoying to update

  • Slower and more memory intensive than more traditional programming languages (C, Java, Perl, Python)

R hex stickers for packages

Introductions

What do you hope to get out of the class?

Why do you want to use R?

Image: rocks with the word "hope" painted on them. Photo by Nick Fewings on Unsplash.

Course Website

Learning Objectives

  • Reading data into R
  • Recoding and manipulating data
  • Using add-on packages
  • Making exploratory plots
  • Understanding basic programming syntax
  • Performing basic statistical tests
  • Writing R functions

Course Format

  • Lecture with slides (possibly “Interactive”)
  • Lab/Practical experience
  • Two 10 min breaks each day - timing may vary
  • Jan 10-21, 2022, 8:30AM-11:50AM on Zoom
  • No class on Jan 17th for Martin Luther King Jr. Day

CoursePlus

CoursePlus: https://courseplus.jhu.edu/core/index.cfm/go/syl:syl.public.view/coid/16733/

There will be surveys throughout the class to give feedback to the instructors.

End of class Survey - link in email.

Grading

  1. Attendance/Participation: 20% - this can be asynchronous - just some sort of interaction with the instructors/TAs (turning in assignments, emailing etc.)
  2. Homework: 3 x 15%
  3. Final “Project”: 35%

Homework assignments and the Final Project are due by Wednesday, Jan 26, 2022 at 11:59pm EST.

Turning homework in early may allow us to give you feedback sooner.

Note: Only people taking the course for credit must turn in the assignments. However, we will evaluate all submitted assignments in case others would like feedback on their work.

Installing R

Getting files from downloads

Basic terms

  • Package - a package in R is a bundle or “package” of code (and possibly data) that can be loaded together for easy repeated use or for sharing with others.

Packages are somewhat analogous to a software application like Microsoft Word on your computer. Your operating system allows you to use Word, just as having R installed (along with any other required packages) allows you to use packages.

  • Function - a function is a particular piece of code that allows you to do something in R. You can write your own, use functions that come directly from installing R, or use functions from additional packages.

A function might help you add numbers together, create a plot, or organize your data. More on that soon!
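As a minimal sketch of these terms, the code below calls a few functions that come with R, calls one through its package name, and then defines a new function. The name add_two and all of the example numbers are made up for illustration.

```r
# Functions that come directly with R
total <- sum(1, 2, 3)           # add numbers together
average <- mean(c(2, 4, 6))     # average of a set of numbers

# A function called through its package, written as package::function.
# The stats package is installed along with R itself.
middle <- stats::median(c(1, 5, 9))

# Writing your own function (add_two is a hypothetical name for this example)
add_two <- function(x) {
  x + 2
}
result <- add_two(5)

print(total)
print(average)
print(middle)
print(result)
```

The package::function form makes it explicit which package a function comes from; once a package is loaded with library(), the prefix is usually optional.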

Tidyverse and Base R

We will mostly show you how to use tidyverse packages and functions.

This is a newer set of packages that can make your code more intuitive or readable.

Collection of R packages
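To give a rough sense of the difference, here is a small sketch that does the same task in base R and with dplyr, one of the tidyverse packages. The scores data set and its column names are invented for this example, and the tidyverse part only runs if dplyr is already installed.

```r
# A small made-up data set for illustration
scores <- data.frame(
  name  = c("A", "B", "C", "D"),
  score = c(90, 72, 85, 60)
)

# Base R: keep rows with a score above 80, then keep only the name column
base_result <- scores[scores$score > 80, "name", drop = FALSE]
print(base_result)

# Tidyverse (dplyr): the same steps written as a pipeline of named verbs.
# This part is skipped if dplyr is not installed yet.
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  tidy_result <- scores %>%
    filter(score > 80) %>%
    select(name)
  print(tidy_result)
}
```

Both versions keep the same two rows; the tidyverse version spells out each step with a verb (filter, select), which many people find easier to read.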

We have an R package called jhur that will make sure all the packages are installed.

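You can copy and paste the code below into your console. We’ll explain what it all means in the next day or two.

# Install the remotes package, which can install packages from GitHub
install.packages("remotes")
# Use remotes to install the jhur package from GitHub
remotes::install_github("muschellij2/jhur")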

Note it may take ~5-10 minutes to run.
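Once the installation finishes, one way to check that it worked is sketched below. This is only a suggestion, not a required course step; it uses requireNamespace(), which returns TRUE or FALSE instead of stopping with an error when a package is missing.

```r
# Check whether jhur is installed (TRUE or FALSE, no error if it is missing)
jhur_installed <- requireNamespace("jhur", quietly = TRUE)

if (jhur_installed) {
  library(jhur)   # load the package for this R session
  message("jhur is installed and loaded.")
} else {
  message("jhur was not found; try running the install code above again.")
}
```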

Useful (+Free) Resources

Want more?
- Tidyverse Skills for Data Science Book: https://jhudatascience.org/tidyversecourse/
- Tidyverse Skills for Data Science Course (can get certificate): https://www.coursera.org/specializations/tidyverse-data-science-r
- R for Data Science: http://r4ds.had.co.nz/
- Open Case Studies: https://www.opencasestudies.org/
- Dataquest: https://www.dataquest.io/

Need help?
- Various “Cheat Sheets”: https://www.rstudio.com/resources/cheatsheets/
- R reference card: http://cran.r-project.org/doc/contrib/Short-refcard.pdf
- R terminology: https://cran.r-project.org/doc/manuals/r-release/R-lang.pdf

Interested in Reproducibility? Check out Candace’s courses at: https://jhudatascience.org/Reproducibility_in_Cancer_Informatics/ and https://jhudatascience.org/Adv_Reproducibility_in_Cancer_Informatics/